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ABSTRACT 


New  methods  for  scaling  square,  nonnegative  matrices  to  dou¬ 
bly  stochastic  form  are  described.  .A  generalized  version  of  the 
convergence  theorem  in  S1NKH0R.\'  and  KNOPP  [1967]  is  proved 
and  applied  to  show  convergence  for  these  new  methods.  Tests 
indicate  that  one  of  the  new  methods  has  significantly  better  aver¬ 
age  and  worst-case  behavior  than  the  Sinkhorn.'Xnopp  method;  for 
one  of  the  3x3  examples  in  MARSHALL  and  OLKj.N  [1968],  SK 
requires  130  times  as  many  operations  as  the  new  algorithm  to 
achieve  row  and  column  sums  1  -  10"®. 


1.  latroductloQ. 

We  seek  an  algorithm  which  will  find  a  pair  o.^  po.sitive  diagonal  matrices  .0 
and  E  for  a  given  square  nonnegative  matrix,  A.  such  that  DAE  is  doubly 
stochastic— or  determine  that  such  p.iir  does  ,'iot  exist. 

A  nonnegative  n  xn  matrix  A  is  said  to  have  support  if  it  possesses  a  posi¬ 
tive  diagonal;  A  has  total  support  if  A  ■«'  0  and  every  positive  entry  in  A  lies  on  a 
positive  diagonal.  A  is  fully  indecomposable  if  it  is  impossible  to  find  permuta¬ 
tion  matrices  P  and  Q  so  that: 


with  A I  square. 
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Three  procedures  for  computing  D  and  E,  when  they  exist,  have  appeared 
in  the  literature:  (l)  minimize  Ay  subject  to  the  constraints 
n  ^  =  n  Vt  =  (2)  minimize 


/(ii.  ,j:„)  = 


n 

ibsl  =  l 
n 

n 

k  =  l 


subject  to  the  constraints 


Xjc  >  0  k  =  1,  ■■■  ,n  and  ^  =  1 


and  (3)  compute  D  and  E  iteratively  by  alternately  normalizing  all  rows  and  all 
columns  in  A.  The  first  method  is  due  to  MARSHALL  and  OLKIX  [1963];  the 
second  is  described  in  DJOKVIC  [1970].  In  each  case,  the  minimization  problem 
is  shown  to  have  a  solution  when  A  is  fully  indecomposable.  The  third  algorithm 
was  first  described  by  DEMING  and  STEPHAN  [1940]  who  called  it  the  "Iterative 
Proportional  Fitting  Procedure"'.  It  was  rediscovered  by  SI.NKHORN 
[1962,1964,1966,1967],  and,  SINKHORN  and  KXOP’’  [1967]  proved  that  D  and  E 
exist  so  that  D  A  E  is  doubly  stochastic  if  and  only  if  4  possesses  total  support. 
Further,  they  showed  that  in  such  a  case,  the  iteration  converges  to  a  solution 
pair  D  and  E.  BRUALDI,  PARTER,  aind  SCHNEIDER  [1966]  independently  proved 
the  existence  of  D  and  E  when  A  is  a  direct  sum  of  fully  indecomposable 
matrices  by  showing  that  its  corresponding  Menon  operator  (MENO.N  [1967])  has 
a  fixed  point.  P’inally,  SINKHORN  [1966]  showed  that  the  Sinkhorn/Knopp 
method  converges  geometrically  for  positive  starting  matrices. 

The  following  result  can  be  applied  to  show  that  a  nonnegative  matrix  has 
total  support  if  and  only  if  it  is  a  direct  sum  of  fully  indecomposable  matrices: 


The  Probenius  —Kanig  Theorevn.  (MARCUS  and  .MINC  [l964],p  97) 

A  nonnegative  nxn  matrix  without  support  contains  an  s  x  f  zero  subma¬ 
trix  where: 


s  +  f  =  n  +  1 
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In  this  paper  we  describe  three  new  iterative  procedures  for  scaling  nonne¬ 
gative  matrices  to  doubly  stochastic  form.  We  prove  a  generalized  version  of  the 
convergence  theorem  in  SINKHORS  and  K\OPP  [1967]  and  apply  it  to  show  that 
for  starting  matrices  with  total  support,  these  new  iteratio.ns  converge  to  diago¬ 
nally  equivalent  limits  which  eire  multiples  of  doubly  stochastic  matrices.  In  the 
final— and  most  interesting— section,  we  present  results  of  tests  comparing  our 
new  methods  to  the  Sinkhorn/Knopp  method  (SK).  One  of  the  new  algorthms, 
EQ,  exhibited  significantly  better  average  and  worst-case  behavior  than  SK;  for 
some  test  matrices,  SK  required  130  times  as  maan  operations  as  EQ  (where  an 
operation  is  a  multiply  or  a  divide)  and  exaunples  for  which  EQ  requires  more 
than  ten  times  as  maaiy  operations  as  SK  are  rare. 

Techniques  for  seeding  to  doubly  stochastic  form  have  a  number  of  applica¬ 
tions.  The  problem  that  launched  Sinkhorn’s  research  was  estimating  the  tran¬ 
sition  matrix  in  a  Markov  chain.  MARSH.'\LL  and  OLKIX  [1979]  contains  refer¬ 
ences  for  other  statistical  applications.  They  can  be  applied  to  equillibrate  a 
general  matrix  with  respect  to  any  p-norm,  p  one  of  us  has  used  EQ  to  test 
for  diagonal  equivalence  to  orthcgcnal  form  by  equillibreting  v.-ith  respect  to  the 
2-norm.  Finally,  we  rem.ark  that  doubly  stochastic  matr’>es  possess  the  follow¬ 
ing  interesting  properties;  (l)  They  are  "perfectly  bal^ocud"  •.•.•ith  .'•cspcct  to 
the  1-norm  (  see  PARLETT  and  REINSCK  [1962]);  (2)  Their  p-norms  are  unity  for 
all  (see  STOER  and  VfiTZGALL  [1962]);  and  (3)  Their  inverses— if  they  exist— 
have  row  and  column  sums  equal  to  unity  (though  the  inverse  of  a  doubly  sto¬ 
chastic  matrix  is  doubly  stochastic  only  for  permutation  matrices). 
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2.  Algorithms 

In  this  section,  we  will  describe  three  iterative  procedures  for  scaling  non¬ 
negative  matrices  to  doubly  stochastic  form.  In  section  three  we  show  that 
when  they  are  applied  to  a  matrix  with  total  support,  the  result  is  a  sequence  of 
iteration  matrices  converging  to  a  multiple  of  a  doubly  stochastic  matrix. 

Our  first  algorithm,  DEV,  was  motivated  by  the  desire  to  have  an  algorithm 
that  would  modify  a  single  row  or  column— leaving  the  remainder  of  the  matrix 
unchanged~at  each  iterative  step.  There  is  a  natural  way  to  select  the  row  or 
column  to  be  changed;  choose  one  whose  sum  deviates  maximally  from  the 
mean  of  the  row  sums  (which  is  also  the  meem  of  the  colunrm  sums).  This 
approach  is  reasonable,  because  matrices  with  equal  row  and  column  sums  are 
scalar  multiples  of  doubly  stochastic  matrices.  For  the  same  reason,  the 
natural  change  is  to  multiply  entries  in  the  selected  row  or  column  by  a  factor 
chosen  so  that  its  new  sum  will  be  the  new  mean  of  row  sums. 

JUgorithm  1  (called  DEV,  for  deviation  reduction) 

Given  A  =  4^°'  an  n  x  n  matrix; 

(1)  Compute  row  and  column  sums  for  A; 

n  <hj  i  =  1 . 

j=i 

cj  t,  (hj  3  =  1 . ^ 

i=\ 

Compute  the  mean,  ;u,  of  row  sums  in  A: 


(2)  Find  indices  p  and  q  so  that 

I  r„  -  u  :  =  max  '  r,  -  /x' 

I 

and 

=  max  '  c.  —  ; 

^  ) 

It  \rp  -  /^i:  <  tol  fj,  and  '  c,  ~  p.  <toL  p.  go  to  step  5. 
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If  ,  c,  -  p,  >  Tp  -  p  go  to  step  4. 

(3)  Calculate  the  mean  p,  of  row  sums  other  than  : 

1  f  n  ] 


1  ^ 


Scale  row  p  to  p: 


‘-’hi  r 


j  -  y,  •  .n 


Update  row  and  column  sums  for  A'. 


Tj,*-  tJL 


J  *-  Cj  +  -  l|  Opy  ;  =  1. 

»  tl 


Go  to  step  2. 

(4)  Calculate  the  mean,  p.  of  column  sums  other  than  c,: 


n  - 1  I ; 


Scale  column  q  to  p; 


*■  Oifl 


•  i  =  1,  .n 


Update  row  and  column  sums  for  A: 


c,  *-  p. 


r  j  ♦-  Tj  +  - 1 

c„ 


Go  to  step  2. 

(5)  Normalize: 


“y  *■  TThi  ij  =  1- 
M 


Exit. 
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Remarks 

Note  that  step  3  is  equivalent  to  premultiplying  matrix  A  by  a  positive  diag¬ 
onal  matrix: 

D  -  diag  (d,.  •  .d„) 


where 


di  = 


1  if  i  9^  p 
U  if  i  =  p 


and  step  4  is  equivalent  to  post  multipljdng  matrix  4  by  a  positive  diagonal 
matrix: 


E  =  diagiei,  ■  ■  .e„) 


where 


if  j  q 
If  j  =  g 


V/e  say  that  a  row  and  column  pair  in  a  nonnegative  matrix  is  balanced 
(with  respect  to  the  1-norm)  if  they  have  equal  sums.  Obviously,  all  row  and 
column  sums  in  a  multiple  of  a  doubly  stochastic  matrix  eure  balanced.  \  second 
approach  to  scaling  to  doubly  stochastic  form,  then,  is  to  find  a  row  and  a 
column  whose  sums  have  maximal  difference  and  to  scale  the  matrix  so  that 
their  sums  are  equal.  This  is  the  approach  taken  by  our  second  algorithm. 


Algorithm  2  (called  BAL,  for  balance) 

Given  A  =  4'°'  an  nxn  nonnegative  matrix: 

(l)  Compute  row  and  column  sums  for  4: 

i  =  1.  .n 

j=» 

n 

Cj  *-  V  aty  ;  =  1. 

1=1 


and  the  mean  of  row  sums: 


(2)  Find  indices  p  and  q  so  that; 


1  rp  -  c,  i  =  max  r^  -  c, 
i.j 


If  ITp  -  Cg  :  <  jLi  fai  go  to  step  5 

(3)  Balance  row  p  and  column  q: 
Multiply  entries  in  row  p  by: 


_  Cg  Op^ 
Tp  -  Opg 


and  multiply  entries  in  column  q  by  /  ^ 

(4)  Update  row  eind  column  sums: 


Ti  ^  +  (f  '  -  1)  opj  i  =  1.  .n 


°P<J 


Go  to  step  2. 

(5)  Normalize: 


Cj+(/-i)ap;  ;  =  1. 

1  n 


“ij  =  !•  ■  ■ 

r' 


Note  that  step  3  is  equivalent  to  forming  the  product  DAE  where; 

D  =  diag{d^,  .d„) 

d,  =  1- 

/ ,  if  i  =  p 

E  =  tixap(e,.  •  ,dn) 

Bj  =  1,  if  j  ?!  q , 
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=  if ;  =  g 

Now  for  the  third  method.  When  testing  DEV  we  found  cases  where  a 
sequence  of  10  or  more  iterations  were  alternately  scaling  the  same  row  and  the 
same  column.  Our  third  algorithm  is  a  variant  of  DEV  that  avoids  this  problem. 
It  records  the  last  row  and  last  column  scaled;  when  it  detects  a  repeat,  it  per¬ 
forms  a  balancing  step. 


Algorithm.  3  {called  EQ,  for  equalize) 

Given  A  =  am  nxn  nonnegative  matrix; 

(l)  Initialize; 


lastr  <-  0 

lastc  <-  0 

I  Oty  i  =  1, 

3-1 

,n 

t  ^3  ;  =  !• 

i  =  l 

,n 

M  -  -\^rA 

(2)  Find  Indices  p  and  q  so  that; 


iTp  -  /j,'  =  max  -  fx 


|c,  -m! 


max  !  Cj  -  jj,  . 


If  :  Tp  —  ju !  <  fi-  tal  and  ,  -  yu.1  <  fj,  tol  go  to  step  6. 

It  Tp  -  \  <  \Cq  —  ii  \  go  to  step  4. 

(3)  If  p  =  lastr  go  to  step  5.  Otherwise,  calculate  the  mean,  p,  of  row  sums 
other  than  Vp: 


Tl  — 1 


n 

V  - 

l'=‘ 

l.*p 


and  scede  row  p  to  p; 


■4 
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Ofcj  *-  “fc;  /  7  =  1. 

Multiply  entries  in  column  1  by  / 

*-  Oii  / "’  i  =  1,  .n 
Update  row  and  column  sums: 

n  •-  +  (/”'  -  l)ati  -1  =  1.  .n 

Update  row  and  column  sums: 


n  *-  +  {/  '  -  1)  “Ti  i  =  1. 


^  (’"fc  -a«)  + 


Go  to  2. 

(6)  Normalize: 


Exit. 


Natation 

To  simplify  the  descriptions  of  the  algorithms  we  have  omitted  program¬ 
ming  details.  In  pau-ticular,  we  have  assumed  that  eill  scaling  and  baleuicing 
operations  are  ceuried  out  explicitly  by  modif>'ing  entries  in  matrix  A-  In  the 
next  section,  it  will  be  convenient  to  assume  that  the  iterations  are  carried  out 
implicitly  by  changing  entries  in  a  pair  of  diagonal  matrices  D  and  E. 

Each  algorithm  produces  a  sequence  of  iteration  matrices  which  are  diago¬ 
nally  equivalent  to  the  starting  matrix  ,4  = 

(2.1)  k  =  1.2, 


=  diag{d['‘K  ■ 
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£■'*'  =  diag{e'^^\ 

eind  we  set  =  /. 

We  introduce  the  following  notation: 


(So,  for  example, 


=  t.  “i)*’ 

;  =  1 


c/*)  =  V  o^) 


and 


=  — 
71 


E  n:*> 

i  =  l 
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3.  Convergence 

In  this  section,  we  will  prove  that  when  a  starting  matrix  has  total  support 
each  of  the  algorithms  described  in  section  2  produces  a  sequence  of  iteration 
matrices  which  converges  to  a  diagonally  equivalent,  doubly  stochastic  limit. 

SINKKORN  and  KNOPP  [1967]  showed  that  when  SK  is  applied  to  an  nxn 
nonnegative  starting  matrix  =  /I  possessing  nonzero  row  and  column  sums, 
the  result  is  a  sequence  of  iteration  matrices  as  in  (2.1)  with  the  following  pro¬ 
perties: 

(Pi)  The  sequence  (  s*  )ifc  =  i.2.  monotonically  increasing  where: 

(3.1)  =  1.2. 

i-1 

(P2)  If  Urn  — - — =  1  then  fori.  7  =  1,  ,n: 

lim  r,'*''  =  1 

lim  c/**  =  1 

g  ifc  1) 

lim  '  =  1 

*  —  gj'  ' 

(P3)  With  the  fc*'*  mean  of  row  sums,  defined  by  (2.3) 

/i*  =  1  A:  =  1,2, 

Algorithms  which,  given  an  nxn  nonnegative  starting  matrix  ,  A,  produce  a 
sequence  of  iteration  matrices  as  in  (2.1)  satsifying  (PI).  (P2),  and  (P3),  will  be 
cadled  "diagonal  product  increasing"  (DPI).  The  following  result  is  a  simple  gen¬ 
eralization  of  the  convergence  theorem  in  SINKKORN  and  KNOPP  [1967]. 
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Theorem  I 

Given  a  sequence  (2.1)  of  diagonal  equivalents  for  A  satisfying  (PI).  (P2),  sind 
(P3): 

[1]  If  A  has  support  then  Urn  exists  and  is  doubly  stochastic. 

k  — > 

[2]  If  /4  has  total  support  then  the  limit  in  [l]  is  diagonally  equivalent  to  A. 
Before  proving  the  theorem  we  slate  and  prove  a  corollary; 

Corollary 

[1]  If  /4  is  diagonally  equivalent  to  a  doubly  stochastic  matrix,  5,  then; 

S  =  lim  4!*^ 

k  ->«> 

[2]  If  A  has  support  and  is  not  diagonally  equivalent  to  a  doubly  stochastic 
matrix,  then  for  each  pair  of  indices  (i,j)  such  that  0,^  does  not 

lie  on  a  positive  diagonal; 

lim  =  0 

Proof  of  Corollary: 

(1)  By  Hirkhon’s  Theorem  (IJIKKHOI'F  [  1946])  the  set  of  n  x  n  doubly  stochastic 
matrices  is  the  convex  huU  of  the  set  of  nxn  permutation  matrices. 
Therefore,  5  and  its  diagonal  equivalent,  A.  have  total  support.  Now  the 
theorem  implies  that  lim  A^'^^  is  doubly  stochastic  and  diagonally  equivalent 

k  -•«» 

to  A.  Since  doubly  stochastic  equivalents  are  unique,  (see  SI.NKHOR.N  and 
KNOPP  [1967]); 

lim  ,4^**  =  5 

(2)  By  the  theorem,  lim  .4'*'  is  doublv  stochastic,  so  it  has  totiil  support,  and 

fc  -••• 

lim  =  0  whenever  Oty  does  not  lie  on  a  positive  diagonal.  ■ 

Note  that  matrices  without  support  are  not  covered  by  the  preceding 
theorem  or  its  corollary.  Such  matrices  are  always  singular,  and  K.\i-.A,N  [private 
communication]  has  shown  that  the  sequence  of  iteration  matrices  (  4'*'  )  pro¬ 
duced  by  SK  cycles  for  such  a  starting  matrix. 


Proof  of  Theorem  1 ; 


We  will  need  the  following  well  known  result: 
Lemma  1  {The  Arithmetic  /  Geometric  '.tean  Inequality) 

If  ^  0  for  i  =  1,  ,n  then: 


n 


X,  =£ 


t  =  l 


with  equedity  only  when  x,  =  x-  =  •  ■  =  x„ 

(1):  (PI)  implies  (sfc)fc  =  i  2  is  monotonically  increasing.  Since  has  support, 
a  permutation,  o,  of  J 1,  ,n  {  exists  such  that; 


^  f.  i7l 


is  a  positive  diagonal  in  .A.  hi.-i  a  =  iiiui  ta,,a.(i)).  Then 


S  ^  s  fin'*’  =  n 

i=l  i=l  i=l  i=l 


(Property  (P3)  is  used  for  the  right  hand  equality)  By  the  arithmetic,  geometric 
inequality: 


=  fl  <  a- 

i  =  l 

and  is  bounded.  Therefore  by  (Pi) 


hm  Sfc  =  T  >  0 


exists,  and 


Sfc 

Itm  — =  1 


fc—  Sl. 


By  (P2): 


’  1*  *■  1 ) 


,lfcn) 


Urn  — r^. —  =  1  and  Urn  ■■  . —  =  1 

*..»  gy*) 


By  (P3),  since  the  >4'*'  are  nonnegative,  no  entry  can  be  larger  ihan  n. 
Therefore,  for  each  index  pair  (i,j)  the  sequence  ( oqj*')  is  Cauchy,  and 
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Urn 

k  — 

exists.  Since  the  row  and  column  sums  in  .4^”^  must  be: 

lim  =  Tj'*'  i  =  1.  .n 

*-••• 

lim  =  Cj'“'  J  =  1.  .n. 

(P2)  implies  that  4^“'  is  doubly  stochastic. 

(2):  To  prove  the  second  half  of  the  theorem  we  will  need  the  following  lemma 
which  is  paraphrased  from  SINKKORN  and  KNOPP  [1967],  p.  345. 

Lemma  2 

If  A  is  a  nonnegative  matrix  with  total  support,  and  (y/*')  are  positive 

sequences  fori  =  1,  ■  ,n  and 7  =  1,  ■  ,n  and: 

lim  =  L,,  >  0 

k-*m  ^  ' 

for  each  index  pair  (i,j)  such  that  ^  0,  then  there  exist  positive 
sequences  and  (y/*')  with  posuive  limits  such  that: 

ii'*''  y/*'  =  x»'*'yj  *'  for  all  i.  j,  and  k 
Now  for  the  proof.  From  part  (l).  we  know  that  lim  oa*'  =  lim 

k  -*»  ^  k  ^  ^ 

exists  for  euiy  i  and  j .  If  Oj,  0  then  lim  e/*'  exists.  Using  (Pi)  we  show  that 
this  limit  is  positive. 

If  Oij  0,  it  lies  on  a  positive  diagonal  in  4.  because  4  has  total  support.  Let 
(7  be  a  permutation  of  j  1 . nj  such  that: 

a  (i)  =j 

“i<7(i)  >  0  1  =  1.  .n 

By  (PI): 


(3.2) 


-1 


fc  =  1.2. 
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Let  a  =  min 
i.J 


cL,j  ;  a,j  0 


Then; 


i=l 


;*:) 


£  =  l 
l>»x 


i: 


i  =  l 


71—1  n-1 

Now  apply  the  arithmetlc/geometric  mean  Inequality: 

)n  '  I 


i=l 


n  a 


n  —  1 


or 


(3.3) 

-1 

> 

'll  s  1 

Combining  (3.2)  and  (3.3): 

(3.4)  ej'*' s  s  j 

f-i] 

n 

which  shows  that  lim  ctj''*' 

k 

ej^^  >  1 

jn  -  l|  n  ’ 


n  '  ! 


n-l 


>  0 


to  see  that  positive  sequences  and  positive  limits  exist  so  that : 

for  each  i.j ,  eind  k 
Set 

5**^  =  diag  ( 

and 

=  <iiag(ej'**) 


Urn  =  Z?'"’  and 

*  -«• 

lim 

*  — 


then 
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exist.  Taking  limits  on  both  sides  of 

£{k)  ~  ^(fc) 

we  obtain 

Naturally,  the  Sinkhorn/Knopp  method  is  product  increasing;  in  the  next 
theorem,  we  will  show  that  normalized  versions  of  DEV,  BAL,  and  EQ,  are  too. 
Here  is  another  example  of  a  DPI  algorithm  defined  for  irreducible,  nonnegalive 
matrices: 


Algorithm. 

At  each  step:  Normalize  the  rows  by  finding  Y,  a  positive  diagonal  matrix,  so 
that  has  row  sums  1.  Then  normalize  the  columns  by  a  diagonal  similarity 

transform  defined  as  follows: 

Let  I  =  (xj,  •  ,x„)  be  a  left  Perron  vector  for 

X  Y  A'‘‘^  =  I  I 

and  let /Y  =  diag(xi,  ■  ■  ■  .x„).  Then 

has  column  sums  1  because 

(1,  ■  ,  =  {1,  ,1) 

(Note  that  the  similarity  transform  leaves  diagonal  products  unchanged) 

Next  we  apply  Theorem  1  to  show  that  the  algorithms  described  in  section  2 
are  convergent  for  starting  matrices  mth  total  support. 

Theorem  2 

Suppose  that  the  sequence  of  iteration  matrices 

A:  =  1,  •  ,n 


results  from  the  application  of  DEV,  BAL,  or  EQ,  to  A  =  then  if  A  has 


j(fc) 

total  support,  lim - 


is  doubly  stochastic  and  diagonally  equivalent  to  A. 


Proof 


We  prove  Theorem  2  by  showing  that  the  sequence  of  normalized  iteration 
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By  the  arithmelLc/geometric  mean  Lnequaiity: 


,,n-l 


s  1 


Mi 


Therefore: 


Mfc 


n  -I 


1 


The  above  arguement  can  be  repeated  for  a  column  scaling  at  step  (k+1),  and 
(Pi)  holds.  Next  define  sequences: 

(3.8)  =  i  =  l.-.n 

by 

T 

if  i=p  and  at  step  (k+1)  rowp  is  scaled 
to  the  mean  of  the  other  row  sums 


if  i=q  and  at  step  (k+1)  column  q  is  scaled 
to  the  mean  of  the  other  column  sums 


otherwise 


By  (3.7),  -^2  a;/**  =  1  for  each  k.  Using  the  arithmetic/geometric 

^  t  =  i 

inequality  it  can  be  shown  that  from: 


mean 


lim  =  lim  =  1 

Ic  -*•  k 


follows: 


hm  =  1  1  =  1, 


Since 


_M*_ 


or 


d,'*’ 


.71 
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r.(fcM) 


and  sirmlau"ly 


for  some  I 


— T— — =  1  or  -  1  =  1. 

el*'  Cq 


,n 


=  1  or 


xlfc-i) 


for  some  £ 


it  follows  that: 


lim 


,>M) 


=  lim  Vn  =  1  •£  =  1. 

fc—  ej'*' 


At  each  step,  DEV  selects  p  or  q  so  that 


l2__  1,  0^  ■:  £2__  1 

Mk  Mfc 


is  maximal.  It  follows  that  fori,^'  =1,  ,n: 


■Ifc) 


lim 


-=  lim 


-j _ _ 


=  1 


fc—  /[ifc  fc—  Mfc 

Therefore  (P2)  holds  for  the  sequence  (3.5)  produced  by  DEV. 

BAL: 

Suppose  that  at  step  (k+1)  BAL  balances  row  p  and  column  q.  Let 

3-!*;)  —  ri*'  —  oJ*' 


'P  “pV 

-g  ^  -  'Vg' 


,(*)  =  ci*'  — 


In  this  case: 


(3.9) 


where 


1  /  Mfcn 


s*  [  1  /  Mfc 


/  /“  = 


Mt  +  i 


/  = 


,;*) 


1/2 


so  to  show  that  s*  n  &  s* ,  we  must  show  that  /u^  ^  s;  ^4* . 
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r'k) 

lim  ^ 

fc—  IJ.IC 


!im 

k  — 


Mfc 


Equation  (3.4)  in  the  proof  of  Theorem  1  holds  whenever  (PI)  and  (P3)  are 
satisfied,  and  for  each  index  pair  (i.j)  such  that  a^j  ^  0,  the  sequence 


Mfc 


fc  =  1.2. 


is  bounded  away  from  zero.  Therefore,  the  sequences 


-Ik) 


Aik 


and 


/ik) 


Aik 


bounded  away  from  zero,  because  i'*)  jk  0  and  ^  0  for  each  k.  We  have; 


(3.17) 


lim  —7^, — ^ - =  lim  7^^ 

y'*V/Zfc  k--  y>*' 


=  1 


Finally,  for  each  i,j,  and  k: 
(^(k  +  l) 

Aifci-i 


_  Mk 

M*  i-1 


or 


5,1* 


i(k) 


Aik 


■r[k) 


g^ffe-i-l) 


,  fkH) 


e,-'' 


•  Ik)  p 


•  .  ,vkl  I 
) 


Therefore,  by  (3.15)  and  (3.17): 


(3.  IS) 


w.(fc  +  i) 


r  ^ 


.ffc*-!) 


•=  lim 


-=  1 


for  each  i  and  j.  and  (P2)  is  satisfied. 

EQ-. 

Each  step  of  EQ  is  a  step  of  DEV  or  a  balancing  step.  The  arguements  above 
for  DEV  and  BAL  show  that  for  each  fc.  s*,.]  >  le.  that  (Pi)  holds.  Consider 
the  sequence  (3.8),  and  its  subsequence 


-  Pi-Pz- 


where  at  steps  fc'  -Pi.pz,  EQ  scaled  a  row  or  column  to  the  mean  of  the 

other  row  or  column  sums.  This  sequence  must  be  infinite—because  lastr  eind 
lastc  are  set  to  0  after  each  balancing  step— and  the  arguement  for  DEV  can  be 
repeated  to  show  that: 


(3.19)  for  i,j  =  1,  ,n 
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4.  Test  Results 

We  ran  comparison  tests  of  the  algorithms  described  in  this  paper  and  the 
Sinkhorn,/Knopp  method  on  a  collection  of  50  10x10  or  smaller  m.atrices. 
These  tests  were  run  on  a  VAX  11/730  at  UC.  Berkeley,  with  7  significant  digits  m 
single  precision  and  16  digits  in  double  precision.  Sums  vvere  accumulated  in 
double  precision. 

For  convergence  to  "tol"  accuracy,  we  required  that  ail  row  atnd  column 
sums  deviate  from  the  mean,  jj.,  by  less  than  toL  ii.  So,  in  the  normalized 
matrices,  row  and  column  sums  could  not  deviate  from  1  by'  more  than  tol. 

The  examples  in  this  section  were  selected  to  illustrate  the  following  points: 

(1)  EQ  exhibited  significantly  better  average  and  worst-case  behavior  than  SK. 
On  our  test  bed,  for  convergence  to  tol  =  lO"’^,  the  ratio  of  total  SK  opera¬ 
tions  to  total  EQ  operations  varied  from  a  low  of  l/'2  to  a  high  of  more  than 
130. 

(2)  We  found  striking  examples  where  EQ  was  significantly  faster  than  DEV.  BAL, 
or  SK.  Since  each  iteration  by  EQ  scaled  a  row  or  cn'iimn-like  DEV-or  bal¬ 
anced  a  row  and  column  pair-like  B.AL— there  is  evidence  that  some 
mechanism  is  at  work  which  c.nables  EQ  :o  choose  ti:.,  rig.ht  operat;o;i  th.. 
right  time. 

To  facilitate  comparing  DEV,  BAL,  and  EQ.  we  counted  their  "steps  "  in  the 
following  way:  each  scaling  of  a  row  or  column  counted  as  1  step,  aind  each 
balancing  of  a  row/column  pair  counted  as  two  steps.  In  this  way,  the  operation 
cost  (where  an  operation  is  a  multiplication  or  division)  was  the  same  for  each 
step  of  each  of  the  three  algorithms. 

To  facilitate  comparing  EQ  and  SK  we  computed  the  approximate  ratio  of 
total  operations  performed  by  EQ  to  total  operations  performed  by  SK. 

These  first  four  examples  were  the  test  matrices  in  MARSH.ALL  and  OLKIN 
[1968]: 


loho^io^ 

10®  1  0 

A  = 

10®  1  1 

B  = 

10®  10®  1 

10®  1  1 

0  10® 10®  j 

10®  10®  0 

i  10“  1  0 

C  = 

10210“  1 

D  = 

10“  10®  I 

0  1  10® 

[  0  10“  10“ 
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STEPS  TO  CONVERGENCE  FOR  MATRIX  .4 


TOL 

DEV 

BAL 

EQ 

10-2 

2 

18 

2 

10-2 

2 

20 

2 

lo-* 

2 

26 

2 

10-5 

2 

32 

2 

STEPS  TO 

CONVERGENC 

TOL 

DEV 

BAL 

EQ 

10-2 

102 

13 

21 

10-2 

201 

24 

26 

10-* 

302 

34 

37 

10-5 

402 

34 

46 

SK  OPERATIONS 
EQ  OPERATIONS 

1  .9 

1  .9 

1  .9 

1  .9 

FOR  MATRIX  B 

SK  OPERATIONS 
EQ  OPERATIONS 


38 

3.3 

75 

5.3 

113 

5.6 

150 

6.0 
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STEPS  TO  CONVERGENCE  FOR  .VLA.TRIX  C 


TOL 

DEV 

BAL 

EQ 

SK 

S/ST  OPERATIONS 
EQ  OPERATIONS 

10-2 

24 

18 

11 

10 

1.7 

10-2 

90 

24 

19 

307 

29.8 

10-* 

1656 

30 

39 

1092 

53.1 

10-® 

3231 

34 

49 

1899 

71.5 

STEPS  TO  CONVERGENCE  FOR  MATRIX  D 

TOL 

DEV 

EQ 

SK 

1  SK  OPERATIONS 
[  EQ  OPERATIONS 

10*2 

231 

22 

21 

94 

7.4 

10-2 

1895 

26 

32 

706 

40.7 

10-“ 

4891 

32 

34 

1830 

99.4 

10-2 

7961 

40 

40 

2963 

137.7 
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Our  next  examples,  R  and  S.  are  (5  x  5)  matrices.  They  illustrate  the  curi 
ous  fact  mentioned  in  (2)  above. 


R 


100  1  0  0  0 

0  200  1  0  0 

0  0  300  1  0 

0  0  0  400  1 

1  0  0  0  500 


STEPS  TO  CONVERGENCE  FOR  MATRIX  R 


TOL 

DEV 

BAL 

EQ 

SK 

\  SK  OPERA.T!OKs\ 
[  EQ  operations] 

10-2 

10 

14 

9 

1 

.4 

10-2 

115 

23 

31 

214 

24.4 

o 

1 

1682 

2474 

51 

630 

43.6 

10-® 

3912 

4036 

68 

1067 

55.4 
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’  40  0  1  1  1  1 

1  80  0  1  1  I 

5  =  1  1  120  0  1 

1  1  1  160  0 

I,  0  1  I  1  200  J 


STEPS  TO  CONVERGENCE  FOR  MATRIX  5 


TOL 

DEV 

BAL 

EQ 

SK 

SK  operations] 
EQ  operations] 

10-2 

39 

20 

16 

15 

3.3 

10-2 

206 

100 

29 

53 

6.5 

10-^ 

384 

108 

47 

94 

7.1 

10-2 

570 

198 

62 

136 

7.7 
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Our  final  five  examples  are  (10  x  10)  upper  Hessenberg  matrices: 

ffi  ~  (/>ai)  uhere  t\,  - 
‘  ^1  otherwisp. 

Hz,  Hz,  and  H^,  each  differ  from  //,  in  a  single  entry: 
the  (1,1)  entry  in  Hz  is  100 
the  (1,2)  entry  in  Hz  is  100 
the  (1,3)  entry  in  H^  is  100 

Hz  is  the  result  of  replacing  all  diagonal  entries  in  Hi  by  100. 


Here  is  a  summary  of  the  results  for  tal  =  10“®: 

STEPS  TO  CONVERGENCE  FOR  .llArRICEb  r/;.  i  =  1,  •  ,5 


MATRIX 

DEV 

BAL 

EQ 

SK 

1  SK  OPERATIONS 
[  EQ  OPERATIONS 

Hi 

812 

748 

812 

55 

.6 

Hz 

873 

926 

717 

72 

.8 

Hz 

925 

952 

775 

71 

.7 

H, 

953 

948 

921 

71 

.6 

Hz 

14476 

17458 

917 

1004 

8.9 
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y^pendix:  Sample  Iteration  matrices  for  EQ  and  SK 


The  following  pages  contain  sample  iteration  matrices  for  EQ  and  SK  when 
applied  to  matrices  C  and  D.  Normalized  iteration  matrices.  DAE.  are  printed 
with  their  row  and  column  sums  and  deviations. 
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